Statistical singing voice conversion with direct waveform modification based on the spectrum differential

نویسندگان

  • Kazuhiro Kobayashi
  • Tomoki Toda
  • Graham Neubig
  • Sakriani Sakti
  • Satoshi Nakamura
چکیده

This paper presents a novel statistical singing voice conversion (SVC) technique with direct waveform modification based on the spectrum differential that can convert voice timbre of a source singer into that of a target singer without using a vocoder to generate converted singing voice waveforms. SVC makes it possible to convert singing voice characteristics of an arbitrary source singer into those of an arbitrary target singer. However, speech quality of the converted singing voice is significantly degraded compared to that of a natural singing voice due to various factors, such as analysis and modeling errors in the vocoderbased framework. To alleviate this degradation, we propose a statistical conversion process that directly modifies the signal in the waveform domain by estimating the difference in the spectra of the source and target singers’ singing voices. The differential spectral feature is directly estimated using a differential Gaussian mixture model (GMM) that is analytically derived from the traditional GMM used as a conversion model in the conventional SVC. The experimental results demonstrate that the proposed method makes it possible to significantly improve speech quality in the converted singing voice while preserving the conversion accuracy of singer identity compared to the conventional SVC.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Statistical singing voice conversion based on direct waveform modification with global variance

This paper presents techniques to improve the quality of voices generated through statistical singing voice conversion with direct waveform modification based on spectrum differential (DIFFSVC). The DIFFSVC method makes it possible to convert singing voice characteristics of a source singer into those of a target singer without using vocoder-based waveform generation. However, quality of the co...

متن کامل

Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion

As one of the techniques enabling individual singers to produce the varieties of voice timbre beyond their own physical constraints, a statistical voice timbre control technique based on the perceived age has been developed. In this technique, the perceived age of a singing voice, which is the age of the singer as perceived by the listener, is used as one of the intuitively understandable measu...

متن کامل

The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016

This paper presents the NU-NAIST voice conversion (VC) system for the Voice Conversion Challenge 2016 (VCC 2016) developed by a joint team of Nagoya University and Nara Institute of Science and Technology. Statistical VC based on a Gaussian mixture model makes it possible to convert speaker identity of a source speaker’ voice into that of a target speaker by converting several speech parameters...

متن کامل

Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential

In our previous work, we have developed a speech modification system capable of manipulating unobserved articulatory movements by sequentially performing speech-to-articulatory inversion mapping and articulatory-to-speech production mapping based on a Gaussian mixture model (GMM)-based statistical feature mapping technique. One of the biggest issues to be addressed in this system is quality deg...

متن کامل

Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion

In this paper, we evaluate our proposed singing voice conversion method from various perspectives. To enable singers to freely control their voice timbre of singing voice, we have proposed a singing voice conversion method based on many-tomany eigenvoice conversion (EVC) that enables to convert the voice timbre of an arbitrary source singer into that of another arbitrary target singer using a p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014